library(bayesrules)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.4 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.0.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
In each situation, we tune a Beta prior that reflects the given information, starting from the mean formula: \(E(\pi) = \frac{\alpha}{\alpha + \beta}\)
Using this formula, I'll derive a ratio between \(\alpha\) and \(\beta\), then do some guessing and checking to figure out which values make sense.
0.4 = alpha / (alpha + beta)
alpha = 2/3 * beta
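The algebra behind that ratio:

\(0.4 = \frac{\alpha}{\alpha + \beta} \Rightarrow 0.4\alpha + 0.4\beta = \alpha \Rightarrow 0.4\beta = 0.6\alpha \Rightarrow \alpha = \tfrac{2}{3}\beta\)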
Using this ratio, I’ll guess and check a few values:
plot_beta(40,60)
This looks close to what we want.
Using the same formula as above gives:
alpha = 4 * beta
plot_beta(80,20)
If we were to take this, the variance would be:
(20 * 80) / ((20 + 80)^2 * (20 + 80 + 1)) = 0.00158
This is too low for the variance we were given (0.05).
Trying another set of values:
plot_beta(60,15)
This gives a variance of 0.0021. This is closer but still a ways off.
Trying a new value:
plot_beta(4,1)
This gives a variance of 0.027, so smaller parameter values push the variance higher. Trying another pair!
plot_beta(2,0.5)
This gives a variance of 0.0457, which is close enough to the 0.05 we want!
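Checking these by hand gets repetitive, so the variance formula can be wrapped in a small helper function (`beta_var` is my own name, not part of bayesrules):

```r
# Variance of a Beta(alpha, beta) prior, straight from the formula above
beta_var <- function(alpha, beta) {
  (alpha * beta) / ((alpha + beta)^2 * (alpha + beta + 1))
}

beta_var(80, 20)  # 0.00158
beta_var(2, 0.5)  # 0.0457
```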
Using the above formula:
alpha = 9 * beta
Trying some values here:
plot_beta(9,1)
This looks to give the range we want.
So this makes me think they have no clue about their chances and the variance is very large. I would expect a pretty fat bell curve:
plot_beta(5,5)
Tune an appropriate Beta prior model for each situation:
alpha = 4 * beta
Now I’m going to use the method from class to get the right beta using the ratio above:
p <- 0.8
n <- 100
quantile(rbeta(10000,p*n,(1-p)*n),c(0.05,0.95))
## 5% 95%
## 0.730684 0.862966
This gives 73% to 86% which is close but not quite there. Trying another n:
p <- 0.8
n <- 75
quantile(rbeta(10000,p*n,(1-p)*n),c(0.05,0.95))
## 5% 95%
## 0.7207177 0.8706077
This gives 72% and 87%. Closer still:
p <- 0.8
n <- 50
quantile(rbeta(10000,p*n,(1-p)*n),c(0.05,0.95))
## 5% 95%
## 0.7015914 0.8839681
This gives 70% and 89%, which is close enough! So the prior is Beta(40,10). I'll check it with plot_beta:
plot_beta(40,10)
alpha = 9 * beta
Trying Beta(.9,.1) gave me a variance of 0.045 which is still too low. Trying Beta(0.45,0.05) gave me a variance of 0.06 which is closer but still too low.
Trying Beta(0.18, 0.02) gives me variance 0.075 which is close enough!
plot_beta(0.18,0.02)
alpha = 5.67 * beta
plot_beta(119,21)
Looks great! Nice range that looks to match.
So the mean is around 0.30 and the range is pretty wide.
alpha = 3/7 * beta
plot_beta(6,14)
This provides a plot that fits those characteristics.
You want to specify a Beta prior for a situation where the parameter is unknown. You think it's equally likely to be anywhere between 0 and 1.
This sounds like a uniform distribution, which corresponds to Beta(1,1):
plot_beta(1,1)
The mean we can get from our formula:
\(E(\pi) = \frac{1}{(1 + 1)} = 0.5\)
This mean of 0.5 makes sense with having no clue. Value will average out to the middle.
The standard deviation we can get from our formula as well:
\(Var(\pi) = \frac{1*1}{((1+1)^2 * (1+1+1))} = \frac{1}{(4 * 3)} = \frac{1}{12}\)
\(SD(\pi) = \frac{1}{12}^{0.5} = 0.2886751\)
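As a simulation sanity check in the same style as the rbeta tuning above (the seed is an arbitrary choice):

```r
# Draws from Beta(1,1) should average 0.5 with SD near sqrt(1/12) = 0.289
set.seed(84735)
draws <- rbeta(100000, 1, 1)
mean(draws)  # approximately 0.5
sd(draws)    # approximately 0.289
```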
A Beta that has the same mean and a smaller SD is Beta(2,2)
plot_beta(2,2)
A Beta with the same mean and a larger SD is Beta(0.5,0.5)
plot_beta(0.5,0.5)
Match the Betas with their graphs:
Looking at the beta models from Question 3.4
Plot c - Beta(6,2) - has the biggest mean
Plot e - Beta(0.5,6) - has the smallest mean
mean_c <- 6 / (6 + 2)
mean_e <- 0.5 / (0.5 + 6)
mean_c
## [1] 0.75
mean_e
## [1] 0.07692308
plot_beta(6,2)
plot_beta(0.5,6)
The formula for the mode is \(Mode(\pi) = \frac{\alpha - 1}{\alpha + \beta - 2}\), valid when \(\alpha\) and \(\beta\) are both greater than 1.
mode_a <- (-.5)/(-1)
mode_b <- 1 / (2)
mode_c <- 5 / 6
mode_d <- 0
mode_e <- -.5 / 4.5
mode_f <- 5/10
mode_a
## [1] 0.5
mode_b
## [1] 0.5
mode_c
## [1] 0.8333333
mode_d
## [1] 0
mode_e
## [1] -0.1111111
mode_f
## [1] 0.5
Graph c (Beta(6,2)) has the largest mode. Graph e (Beta(0.5,6)) has the smallest.
This matches the values we got from the bayesrules package before reloading it in class. The mode formula given in the text only applies when \(\alpha\) and \(\beta\) are both greater than 1, which is why some of the values above (like the negative mode for e) don't make sense as modes.
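A quick check with dbeta illustrates the issue: for Beta(0.5, 0.5) the formula returns 0.5, but the density there is actually a minimum, not a maximum:

```r
# The mode formula applied to Beta(0.5, 0.5) gives 0.5...
(0.5 - 1) / (0.5 + 0.5 - 2)          # 0.5

# ...but the density is U-shaped: lowest at 0.5, climbing toward 0 and 1
dbeta(c(0.01, 0.5, 0.99), 0.5, 0.5)
```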
Using the variance formula and then square-rooting it will give us the SD for each of these:
\(Var(\pi) = \frac{\alpha\beta}{(\alpha + \beta)^2 * (\alpha + \beta + 1)}\)
SD_a <- sqrt(0.25 / (1 * 2))
SD_b <- sqrt(4 / (16*5))
SD_c <- sqrt(12 / (64*9))
SD_d <- sqrt(1 / (4*3))
SD_e <- sqrt(3 / (42.25 *7.5))
SD_f <- sqrt(36 /(144*13))
SD_a
## [1] 0.3535534
SD_b
## [1] 0.2236068
SD_c
## [1] 0.1443376
SD_d
## [1] 0.2886751
SD_e
## [1] 0.09730085
SD_f
## [1] 0.138675
The largest SD is for graph a, Beta(0.5,0.5); The smallest SD is for graph e, Beta(0.5,6)
Use plot_beta and summarize_beta to confirm answers to exercise 3.6
plot_beta(0.5,0.5)
summarize_beta(0.5,0.5)
## mean mode var sd
## 1 0.5 0 and 1 0.125 0.3535534
plot_beta(2,2)
summarize_beta(2,2)
## mean mode var sd
## 1 0.5 0.5 0.05 0.2236068
plot_beta(6,2)
summarize_beta(6,2)
## mean mode var sd
## 1 0.75 0.8333333 0.02083333 0.1443376
plot_beta(1,1)
summarize_beta(1,1)
## mean mode var sd
## 1 0.5 NaN 0.08333333 0.2886751
plot_beta(0.5,6)
summarize_beta(0.5,6)
## mean mode var sd
## 1 0.07692308 0 0.009467456 0.09730085
plot_beta(6,6)
summarize_beta(6,6)
## mean mode var sd
## 1 0.5 0.5 0.01923077 0.138675
This confirms that the largest and smallest means come from charts (c) and (e) respectively, the largest and smallest modes from (c) and (e) respectively, and the largest and smallest standard deviations from (a) and (e) respectively.
Let \(\pi\) be the proportion of US residents that prefer the term ‘pop.’ Two different beverage salespeople from different regions have different priors for \(\pi\). The first from North Dakota specifies Beta(8,2) as prior. The second in Louisiana specifies Beta(1,20) as prior.
I’ll do this using the following formulas:
\(E(\pi) = \frac{\alpha}{\alpha + \beta}\) \(Mode(\pi) = \frac{\alpha - 1}{\alpha + \beta - 2}\) \(Var(\pi) = \frac{\alpha\beta}{(\alpha + \beta)^2 * (\alpha + \beta + 1)}\)
mean_ND <- 8 / (8 + 2)
mean_L <- 1 / (1 + 20)
mode_ND <- (8 - 1) / (8 + 2 - 2)
mode_L <- (1 - 1) / (1 + 20 -2)
sd_ND <- ((8*2) / (((8+2)^2) * (8 + 2 + 1)))^0.5
sd_L <- ((1*20) / (((1 + 20)^2) * (1 + 20 + 1)))^0.5
mean_ND
## [1] 0.8
mean_L
## [1] 0.04761905
mode_ND
## [1] 0.875
mode_L
## [1] 0
sd_ND
## [1] 0.1206045
sd_L
## [1] 0.04540298
So we have the mean, mode and sd for North Dakota as 0.8, 0.875 and 0.12 respectively. For Louisiana the mean, mode, and sd is 0.048, 0, and 0.045 respectively.
ND_soda <- plot_beta(8,2) + labs(title="ND Prior")
L_soda <- plot_beta(1,20) + labs(title="Louisiana Prior")
ND_soda
L_soda
The North Dakota salesperson has an understanding that a large percentage of people say 'pop': the density is very high for \(\pi\) above 80%. By contrast, the Louisiana salesperson believes that very few people say 'pop': the density is highest from 0% to about 20% and then tapers off.
Now we poll 50 US residents and 12 prefer the term ‘pop’
This puts us in the land of the Beta-Binomial. Here y = 12 and n = 50.
To do this by hand first, I'll apply Bayes' Rule to the North Dakota prior:
\(f(\pi|y=12) \propto \frac{\Gamma(8 + 2)}{\Gamma(8)\Gamma(2)} \pi^{8-1} (1-\pi)^{2-1} \cdot {50 \choose 12} \pi^{12} (1-\pi)^{38}\)
Dropping the constants, this simplifies to: \(c \cdot \pi^{19} (1-\pi)^{39}\)
Matching this to the Beta kernel \(\pi^{\alpha - 1}(1-\pi)^{\beta - 1}\) gives a posterior with \(\alpha = 20\) and \(\beta = 40\).
Let’s try and confirm with R:
summarize_beta_binomial(alpha = 8,beta=2,y=12,n=50)
## model alpha beta mean mode var sd
## 1 prior 8 2 0.8000000 0.8750000 0.014545455 0.12060454
## 2 posterior 20 40 0.3333333 0.3275862 0.003642987 0.06035716
The reason there is no extra minus 1 here: the Beta prior density is proportional to \(\pi^{\alpha-1}(1-\pi)^{\beta-1}\), so after multiplying by the likelihood the posterior parameters are simply \(\alpha + y = 20\) and \(\beta + (n - y) = 40\). We'll take these values for the posterior.
Now for Louisiana. I'll derive the posterior the same way and check it:
\(f(\pi|y=12) \propto \frac{\Gamma(1 + 20)}{\Gamma(1)\Gamma(20)} \pi^{1-1} (1-\pi)^{20-1} \cdot {50 \choose 12} \pi^{12} (1-\pi)^{38}\)
Dropping the constants, this simplifies to: \(c \cdot \pi^{12} (1-\pi)^{57}\)
This is the kernel of a Beta with \(\alpha = 13\) and \(\beta = 58\). Now to check it:
summarize_beta_binomial(alpha=1,beta=20,y=12,n=50)
## model alpha beta mean mode var sd
## 1 prior 1 20 0.04761905 0.000000 0.002061431 0.04540298
## 2 posterior 13 58 0.18309859 0.173913 0.002077410 0.04557861
ND_pop <- plot_beta_binomial(alpha = 8,beta=2,y=12,n=50) + labs(title="ND pop")
L_pop <- plot_beta_binomial(alpha=1,beta=20,y=12,n=50) + labs(title="Louisiana pop")
ND_pop
L_pop
The North Dakota salesperson's prior was way off from the observed data, so the posterior has shifted significantly to the left of the prior, though it still sits to the right of the observed proportion.
The Louisiana prior was below the observed proportion but closer to it, so the shift between prior and posterior is less dramatic, with the posterior pulled up toward the data.
A university wants to know what proportion of students are regular bike riders, \(\pi\), so that they can install an appropriate number of bike racks. Since the university is in Cali, staff think that \(\pi\) has a mean of 1/4 and mode of 5/22.
\(\frac{\alpha}{\alpha + \beta} = 1 /4\)
\(\frac{\alpha - 1}{\alpha + \beta - 2} = 5 /22\)
Using these two formulas, I find that \(\alpha\) = 6 and \(\beta\) = 18.
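The algebra, for the record: the mean equation gives \(\beta = 3\alpha\); substituting into the mode formula:

\(\frac{\alpha - 1}{\alpha + 3\alpha - 2} = \frac{5}{22} \Rightarrow 22(\alpha - 1) = 5(4\alpha - 2) \Rightarrow 22\alpha - 22 = 20\alpha - 10 \Rightarrow \alpha = 6,\ \beta = 18\)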
Beta(6,18)
plot_beta(6,18)
Using R:
summarize_beta_binomial(alpha=6,beta=18,y=15,n=50)
## model alpha beta mean mode var sd
## 1 prior 6 18 0.2500000 0.2272727 0.007500000 0.08660254
## 2 posterior 21 53 0.2837838 0.2777778 0.002710007 0.05205773
This gives a posterior = Beta(21,53).
I’m going to plot them all as well:
plot_beta_binomial(alpha = 6,beta=18,y=15,n=50)
mean = 0.284 mode = 0.278 sd = 0.0521
The posterior is pretty split between the prior and the data. It looks a bit more like the incoming data. The certainty of the university was modest and the survey data n was also modest, so the posterior was pretty balanced between the two.
A 2017 survey found that 10.2% of LGBT adults in the US were married to a same-sex spouse. Now it's 2020, and Bayard guesses that \(\pi\), the percentage of LGBT adults in the US who are married to a same-sex spouse, has most likely increased to about 15% but could range from 10% to 25%.
For this we’re going to take the mean to be 0.15. Using the following formulas I’ll find an \(\alpha\) and \(\beta\):
\(\alpha\) = .15/.85 * \(\beta\)
Trying it:
plot_beta(3,17)
This doesn't quite give the range we want. Now I'll try larger parameters with the same mean:
plot_beta(12,68)
This looks closer to the range we want.
The new survey provides y = 30 and n = 90. We can use R to calculate the posterior model:
summarize_beta_binomial(alpha = 12,beta=68,y=30,n=90)
## model alpha beta mean mode var sd
## 1 prior 12 68 0.1500000 0.1410256 0.001574074 0.03967460
## 2 posterior 42 128 0.2470588 0.2440476 0.001087841 0.03298243
This provides the posterior mean, mode, and standard deviation, but I can also do this ‘by hand’:
#mean
42 / (42+128)
## [1] 0.2470588
#mode
41 / (42+128 -2)
## [1] 0.2440476
#SD
sqrt((42*128) / (((42+128)^2) * (42 + 128 + 1)))
## [1] 0.03298243
Mean = 0.247 Mode = 0.244 SD = 0.033
plot_beta_binomial(alpha = 12,beta=68,y=30,n=90)
The posterior model looks a bit closer to the data than the prior information. The incoming data is from a decent sized n, but the prior was pretty strong. Overall the posterior is pretty split between the two, weighing each.
In 2016, Pew found that 30% of US adults are aware that they know someone who is transgender. In the 2020s, Sylvia believes that the current percentage has increased to between 35% and 60%.
I'm going to assume an even distribution between these values, so Sylvia's proposed mean is \((0.35 + 0.60)/2 = 0.475\).
\(\alpha = \frac{0.475}{0.525} \beta\)
Using the class method:
p <- 0.475
n <- 100
quantile(rbeta(10000,p*n,(1-p)*n),c(0.05,0.95))
## 5% 95%
## 0.3930289 0.5567674
This gives a range between 39% and 55% which is close but I want it to be broader so I’ll decrease n:
p <- 0.475
n <- 50
quantile(rbeta(10000,p*n,(1-p)*n),c(0.05,0.95))
## 5% 95%
## 0.3602770 0.5898979
This gives a range of 36% and 59% which is close enough! This provides a Beta of Beta(23.75,26.25). I’ll plot it below:
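These parameters come straight from the tuning rule used above, \(\alpha = pn\) and \(\beta = (1-p)n\):

\(\alpha = 0.475 \times 50 = 23.75, \qquad \beta = 0.525 \times 50 = 26.25\)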
plot_beta(23.75,26.25)
Now we can estimate the posterior model using the update rule derived from the prior and the likelihood:
\(\alpha_{post} = \alpha + y, \qquad \beta_{post} = \beta + (n - y)\)
\(\alpha_{post} = 23.75 + 80 = 103.75, \qquad \beta_{post} = 26.25 + (200 - 80) = 146.25\)
I’ll test that this matches and I’ll plot the posterior:
summarize_beta_binomial(alpha = 23.75,beta=26.25,y=80,n=200)
## model alpha beta mean mode var sd
## 1 prior 23.75 26.25 0.475 0.4739583 0.0048897059 0.06992643
## 2 posterior 103.75 146.25 0.415 0.4143145 0.0009672311 0.03110034
plot_beta_binomial(alpha=23.75,beta=26.25,y=80,n=200)
This gives me the same posterior values for alpha and beta.
mean = 0.415 mode = 0.414 SD = 0.0311
The posterior is quite different from the prior here. The prior was moderately confident, but we got a fair amount of incoming data, so the data overwhelmed the prior to generate the posterior model.
To find the y and n values for this, I’ll use the formula below:
\(f(\pi|y) \propto \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} \pi^{\alpha - 1} (1-\pi)^{\beta - 1} \cdot {n \choose y} \pi^{y} (1-\pi)^{n-y}\)
When you simplify, the posterior Beta values can be taken by:
\(\alpha_{post} = \alpha + y\)
\(\beta_{post} = \beta + (n - y)\)
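Rearranging, this update rule can be inverted to recover the data from the prior and posterior parameters:

\(y = \alpha_{post} - \alpha, \qquad n = y + (\beta_{post} - \beta)\)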
Taking this rule, y = 11 - 2 = 9, and n = 24 - 3 + 9 = 30. Let’s check if these get us the posterior:
summarize_beta_binomial(alpha = 2,beta=3,y=9,n=30)
## model alpha beta mean mode var sd
## 1 prior 2 3 0.4000000 0.3333333 0.040000000 0.20000000
## 2 posterior 11 24 0.3142857 0.3030303 0.005986395 0.07737179
Ayeee!! That works!!!
Using the same rules as above:
y = 100 - 1 = 99 n = 3 - 2 + 99 = 100
summarize_beta_binomial(alpha = 1,beta=2,y=99,n=100)
## model alpha beta mean mode var sd
## 1 prior 1 2 0.3333333 0.000000 0.0555555556 0.23570226
## 2 posterior 100 3 0.9708738 0.980198 0.0002719027 0.01648947
Amaze.
The prior has a high density at a very high \(\pi\). This shows that the prior model is pretty confident that values will be around 90% or higher.
The likelihood function has a lower density but at a radically lower value of \(\pi\). This is saying that the incoming data gives modest confidence that the value lies around 20%.
The posterior model shows modest confidence that the value lies around 75%, agreeing more closely with the prior. Because the prior was so confident, it carries heavier weight in the posterior; but because the prior and the likelihood sit so far apart, the posterior still falls below the prior values around 90%.
I would need a prior with a high mean and a likelihood with a small n whose success ratio is around 20%.
To get this, I might use 90% as mean for the prior and 20% as the mean for the likelihood.
\(\alpha\) = 9 * \(\beta\)
y / n = .2
\(\alpha\) = 900
\(\beta\) = 100
y = 2
n = 10
Let’s see how these values work
plot_beta_binomial(alpha = 900, beta = 100, y = 2,n=10)
Okay, let's push the prior mean closer to 1 and give a bit more certainty to the incoming data. I'm also lowering the prior's certainty so its density comes down closer to 30:
alpha = 99 * beta
alpha = 49.5
beta = 0.5
y = 4
n = 20
plot_beta_binomial(alpha=49.5,beta=.5,y=4,n=20)
This looks pretty close to the graph provided.
The prior conveys little certainty about where the data will lie. It has a mean of 0.5 and the distribution is very wide, spanning basically all of the possible values of \(\pi\).
The likelihood function has more confidence in values of \(\pi\) closer to 10%. It’s pretty confident the value is between 0% and 25%.
The posterior is very close to the data / likelihood. It also gives a mean around 12%-15% and a range between about 0% and 30%. This makes sense because we had very little information from the prior.
With a mean of 0.5, alpha and beta are equal.
If we make the mean of the data around 10%, we can try values for y and n below that match that:
alpha = 3
beta = 3
y = 3
n = 30
plot_beta_binomial(alpha = 3,beta=3,y=3,n=30)
That actually looks pretty good, except the density can be higher for the likelihood, so I’ll up it a bit:
plot_beta_binomial(alpha=3,beta=3,y=4,n=40)
That looks close to me!
summarize_beta_binomial(alpha = 3,beta=3,y=30,n=40)
## model alpha beta mean mode var sd
## 1 prior 3 3 0.5000000 0.5000000 0.035714286 0.1889822
## 2 posterior 33 13 0.7173913 0.7272727 0.004313639 0.0656783
plot_beta_binomial(alpha=3,beta=3,y=30,n=40)
summarize_beta_binomial(alpha=3,beta=3,y=15,n=20)
## model alpha beta mean mode var sd
## 1 prior 3 3 0.5000000 0.5000000 0.035714286 0.18898224
## 2 posterior 18 8 0.6923077 0.7083333 0.007889546 0.08882312
plot_beta_binomial(alpha=3,beta=3,y=15,n=20)
Both Patrick's and Harold's posteriors look fairly similar. Patrick's has a slightly higher mean and mode, while Harold's has a higher standard deviation. The similarities arise because both surveys show the same success rate; the difference arises because Patrick's larger sample size gives his posterior more certainty.